Skip to content

Add TaskExecutor conformance to EventLoops #2732

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

fabianfett
Copy link
Member

@fabianfett fabianfett commented May 29, 2024

Add TaskExecutor conformance to EventLoops

  • Support NIO Event Loops as task executors

Motivation

Swift 6 introduces the option to supply your own task executors to the Swift concurrency runtime (SE-0417). SwiftNIO's EventLoop should conform to the new TaskExecutor protocol to support running Swift concurrency Tasks on it. This will allow fewer context switches in Swift on server applications.

Modifications

  • Extend EventLoop to expose an taskExecutor property (the good executor name is sadly already taken by the SerialExecutor)
  • Rename NIODefaultSerialEventLoopExecutor to NIODefaultEventLoopExecutor
  • NIODefaultEventLoopExecutor now also implements TaskExecutor protocol
  • Expose NIOTaskEventLoopExecutor protocol that makes it easy for adopters to get a better implementation
  • Add test

Result

Adopters can use ELs as task executors

@fabianfett fabianfett marked this pull request as draft May 29, 2024 12:45
@fabianfett fabianfett force-pushed the ff-nio-task-executor branch 2 times, most recently from c48fcba to a982359 Compare May 29, 2024 13:18
// MARK: TaskExecutor conformance
#if compiler(>=6.0)
@available(macOS 9999.0, iOS 9999.0, watchOS 9999.0, tvOS 9999.0, *)
extension SelectableEventLoop: NIOTaskEventLoopExecutor { }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add conformance for AsyncTestingEventLoop and a fatalErroring conformance for EmbeddedEventLoop?

let loop1 = iterator.next()!
let loop2 = iterator.next()!

await self._runTests(loop1: loop1, loop2: loop2)
Copy link
Member Author

@fabianfett fabianfett May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test currently crashes:

Thread 4 Crashed:: NIO-ELT-0-#0
0   libswiftCore.dylib            	       0x19a319644 swift_getObjectType + 88
1   libswift_Concurrency.dylib    	       0x2617ea0e4 swift_task_enqueueImpl(swift::Job*, swift::SerialExecutorRef) + 160
2   libswift_Concurrency.dylib    	       0x2617ec890 swift::AsyncTask::flagAsAndEnqueueOnExecutor(swift::SerialExecutorRef) + 432
3   libswift_Concurrency.dylib    	       0x2617eacdc swift_task_switchImpl(swift::AsyncContext*, void (swift::AsyncContext* swift_async_context) swiftasynccall*, swift::SerialExecutorRef) + 640
4   swift-nioPackageTests         	       0x1080d5ca1 partial apply for closure #1 in closure #1 in TaskExecutorTests._runTests<A, B>(loop1:loop2:) + 1
5   swift-nioPackageTests         	       0x1080d4d3d thunk for @escaping @isolated(any) @callee_guaranteed @Sendable @async () -> (@out A) + 1
6   swift-nioPackageTests         	       0x1080d4e99 partial apply for thunk for @escaping @isolated(any) @callee_guaranteed @Sendable @async () -> (@out A) + 1
7   libswift_Concurrency.dylib    	       0x2617f1921 completeTaskWithClosure(swift::AsyncContext*, swift::SwiftError*) + 1

Why do we call swift_task_switchImpl with a SerialExecutorRef?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch is what checks if it can "switch" without releasing actor locks; it won't be able to switch when there's a task executor. I'll look into reproducing this though with a debug runtime to check.

@FranzBusch
Copy link
Member

Can we also add another benchmark for the NIOAsyncChannel that uses the task executor next to the global executor hook to prove that this indeed does what we want it to do.

@fabianfett fabianfett force-pushed the ff-nio-task-executor branch from ba8f9fb to 6018f07 Compare June 4, 2024 09:22
@fabianfett
Copy link
Member Author

fabianfett commented Jun 4, 2024

Performance results on my machine:

==================
NIOPosixBenchmarks
==================

TCPEcho pure NIO 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches        │    4370 │    4370 │    4370 │    4370 │    4370 │    4370 │    4370 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) *        │     428 │     428 │     428 │     428 │     428 │     428 │     428 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms) * │    4537 │    4537 │    4537 │    4537 │    4537 │    4537 │    4537 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TCPEchoAsyncChannel pure async/await 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches (K)    │     148 │     148 │     148 │     148 │     148 │     148 │     148 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) (K) *    │    1492 │    1492 │    1492 │    1492 │    1492 │    1492 │    1492 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms) * │    8431 │    8431 │    8431 │    8431 │    8431 │    8431 │    8431 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TCPEchoAsyncChannel using globalHook 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches        │    9498 │    9498 │    9498 │    9498 │    9498 │    9498 │    9498 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) (K) *    │     221 │     221 │     221 │     221 │     221 │     221 │     221 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms) * │    9011 │    9011 │    9011 │    9011 │    9011 │    9011 │    9011 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TCPEchoAsyncChannel using task executor preference 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches        │    8352 │    8352 │    8352 │    8352 │    8352 │    8352 │    8352 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) (K) *    │    6175 │    6175 │    6175 │    6175 │    6175 │    6175 │    6175 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms) * │    9336 │    9336 │    9336 │    9336 │    9336 │    9336 │    9336 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

@fabianfett fabianfett force-pushed the ff-nio-task-executor branch from 8ec67a1 to cb67cb2 Compare April 2, 2025 23:38
@fabianfett fabianfett added the 🆕 semver/minor Adds new public API. label Apr 3, 2025
@fabianfett
Copy link
Member Author

fabianfett commented Apr 3, 2025

Updated the PR for 6.1:

TCPEcho pure NIO 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches        │    2548 │    2548 │    2548 │    2548 │    2548 │    2548 │    2548 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) *        │     137 │     137 │     137 │     137 │     137 │     137 │     137 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms) * │    3815 │    3815 │    3815 │    3815 │    3815 │    3815 │    3815 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TCPEchoAsyncChannel pure async/await 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches (K)    │     159 │     159 │     159 │     159 │     159 │     159 │     159 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) (K) *    │    1474 │    1474 │    1474 │    1474 │    1474 │    1474 │    1474 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms) * │    7376 │    7376 │    7376 │    7376 │    7376 │    7376 │    7376 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TCPEchoAsyncChannel using globalHook 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches (K)    │      20 │      20 │      20 │      20 │      20 │      20 │      20 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) (K) *    │     176 │     176 │     176 │     176 │     176 │     176 │     176 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (s) *  │      13 │      13 │      13 │      13 │      13 │      13 │      13 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

TCPEchoAsyncChannel using task executor preference 1M times
╒═════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                  │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞═════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Context switches (K)    │      17 │      17 │      17 │      17 │      17 │      17 │      17 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Malloc (total) (M) *    │      17 │      17 │      17 │      17 │      17 │      17 │      17 │       1 │
├─────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (s) *  │      15 │      15 │      15 │      15 │      15 │      15 │      15 │       1 │
╘═════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

Looks like the numbers got worse.

➜  swift-nio git:(ff-nio-task-executor) swift --version
swift-driver version: 1.120.5 Apple Swift version 6.1 (swiftlang-6.1.0.110.5 clang-1700.0.13.3)
Target: arm64-apple-macosx15.0

@fabianfett
Copy link
Member Author

I hope that this PR will improve the numbers:
swiftlang/swift#80316

@fabianfett fabianfett requested a review from Lukasa April 3, 2025 09:20
@fabianfett fabianfett marked this pull request as ready for review April 3, 2025 11:24
Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just adding an explicit "not now" review. Task executors intersect very poorly with region-based isolation right now, so holding off on this is the lesser of two evils.

@FranzBusch
Copy link
Member

Just adding an explicit "not now" review. Task executors intersect very poorly with region-based isolation right now, so holding off on this is the lesser of two evils.

I just want to expand on this in case anyone sees this comment and assumes they should avoid task executors. That is certainly not the case. Task executors and region based isolation work together nicely; however, the problem is that swift concurrency has no way to represent a value that’s tied to a specific thread unless that thread is modeled with an isolation domain (actor or Sendable closures). This is a common pattern in NIO though which often makes runtime checks that code is executing in the right event loops before allowing changes or returning event loop bound state. Doing this directly from asynchronous swift code is only possible when using custom actor executors, or task executors or by hooking the global executor. Arguably we have already allowed two (actor and global) of the three potentially dangerous patterns here.

So in general when writing pure Swift concurrency code using task executors with region isolation is completely safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🆕 semver/minor Adds new public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants